Proximal Policy Optimization (PPO):
Proximal Policy Optimization (PPO) is a policy-gradient reinforcement learning algorithm for training deep neural network policies in complex environments. Resources for learning more about PPO include the original paper, an implementation in the OpenAI Baselines library, a tutorial, a video lecture, a blog post, and a paper on PPO for multi-task learning.
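
As a rough illustration of what PPO optimizes, the sketch below computes the clipped surrogate objective described in the original paper, using only the Python standard library. The function name `ppo_clip_objective`, its parameter names, and the default `clip_eps=0.2` are assumptions chosen for this example; they are not taken from OpenAI Baselines or any other library.

```python
import math

def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (higher is better).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for those actions.
    clip_eps: clipping range (hypothetical default, assumed for this sketch).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)

# Usage: the ratio is 2 and the advantage is positive, so the clipped term is used.
value = ppo_clip_objective([math.log(2.0)], [0.0], [1.0])
```

Taking the minimum of the two terms means the objective gives no extra credit for moving the probability ratio outside the clipping range in the direction the advantage favors, which is what keeps each policy update close to the previous policy. In practice this quantity would be computed with an automatic-differentiation framework and maximized by gradient ascent, typically alongside value-function and entropy terms.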